Lexicon Reduction Based on Global Features for On-line Handwriting
Abstract
Handwriting recognition generally relies on a dictionary. For large vocabularies, our system helps check the existence of the words proposed by the recognizer or, when recognition requires one model per word, reduces the number of models to evaluate. For smaller vocabularies, it can be used as a recognizer in its own right. From the signal provided by a digitizing tablet, we build a global silhouette that represents a word in a very simple and compact way. After a statistical study of the possible silhouettes of each character of the alphabet and their frequencies, we use them to build all the possible silhouettes associated with each lexicon word, together with their probabilities. The lexicon is organized as an "epigenetic" neural network. In the recognition phase, when retrieving the words corresponding to a given silhouette, this network reaches not only the words that were described with this silhouette, but also those that have a similar silhouette without having been described by it. The result is a list of words ranked according to the relevance of this silhouette feature for each word. We also show how to combine different kinds of features with such networks, and how to introduce a confidence rate in the word outline by using continuous features. The theoretical results are very encouraging. Measurements on on-line texts are in progress.
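The silhouette-based lookup described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-code silhouette alphabet (`x`, `A`, `D`), the per-character probabilities in `CHAR_SILHOUETTES`, the independence of characters when multiplying probabilities, and the function names. In particular, the "epigenetic" neural network is not reproduced; a plain index with a mismatch penalty stands in for its ability to reach words with a similar silhouette.

```python
from collections import defaultdict
from itertools import product

# Hypothetical per-character statistics: silhouette code -> probability.
# Codes (assumed): "x" = middle zone only, "A" = ascender, "D" = descender.
CHAR_SILHOUETTES = {
    "a": {"x": 1.0},
    "b": {"A": 1.0},
    "d": {"A": 1.0},
    "g": {"D": 1.0},
    "l": {"A": 0.9, "x": 0.1},
    "o": {"x": 1.0},
    "t": {"A": 0.7, "x": 0.3},
}

LEXICON = ["dog", "bag", "lot", "go"]


def word_silhouettes(word):
    """Enumerate every silhouette of a word with its probability,
    assuming the characters vary independently."""
    variants = [CHAR_SILHOUETTES[ch].items() for ch in word]
    result = defaultdict(float)
    for combo in product(*variants):
        silhouette = "".join(code for code, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        result[silhouette] += prob
    return dict(result)


def build_index(lexicon):
    """Map each silhouette to the words it can describe."""
    index = defaultdict(dict)
    for word in lexicon:
        for silhouette, prob in word_silhouettes(word).items():
            index[silhouette][word] = prob
    return index


def retrieve(index, observed, mismatch_penalty=0.1, max_mismatches=1):
    """Rank words whose silhouettes match the observed one exactly or
    differ in at most max_mismatches positions (same length only)."""
    scores = defaultdict(float)
    for silhouette, words in index.items():
        if len(silhouette) != len(observed):
            continue
        mismatches = sum(a != b for a, b in zip(silhouette, observed))
        if mismatches > max_mismatches:
            continue
        weight = mismatch_penalty ** mismatches
        for word, prob in words.items():
            scores[word] = max(scores[word], weight * prob)
    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    index = build_index(LEXICON)
    for word, score in retrieve(index, "AxD"):
        print(word, round(score, 3))
```

In this toy setting, an observed silhouette `AxD` retrieves the words described by it and, with a lower score, words whose silhouette differs in one position, while words of a different length are discarded. That is the lexicon-reduction effect the abstract refers to, under the simplifying assumptions stated above.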
Similar resources
On-Line Handwriting Recognition Using Hidden Markov Models
New global information-bearing features improved the modeling of individual letters, thus diminishing the error rate of an HMM-based on-line cursive handwriting recognition system. This system also demonstrated the ability to recognize on-line cursive handwriting in real time. The BYBLOS continuous speech recognition system, a hidden Markov model (HMM) based recognition system, is applied to on...
Correlation between handwriting characteristics
This study is part of an on-line editor project [1]. It aims to find correlations between handwriting characteristics and to keep the most relevant ones, bringing handwriting styles to light in order to improve subsequent recognition. These characteristics can be used to extract global word features to reduce a lexicon, as in [2], or to select style-based recognizers. A similar study has been carried out by JP ...
Reduction of Non Deterministic Automata for Hidden Markov Model Based Pattern Recognition Applications
Most on-line cursive handwriting recognition systems use a lexical constraint to help improve the recognition performance. Traditionally, the vocabulary lexicon is stored in a trie (automaton whose underlying graph is a tree). In a previous paper, we showed that non-deterministic automata were computationally more efficient than tries. In this paper, we propose a new method for constructing inc...
From Off-line to On-line Handwriting Recognition
On-line handwriting includes more information on the time order of the writing signal and on the dynamics of the writing process than off-line handwriting. Therefore, on-line recognition systems achieve higher recognition rates. This can be concluded from results reported in the literature, and has been demonstrated empirically as part of this work. We propose a new approach for recovering the ...
On-Line Handwriting Recognition Based on Bigram Co-Occurrences
We propose a handwriting recognition method that utilizes the n-gram statistics of the English language. It is based on the linguistic property that very few pairs of English words share exactly the same letter bigrams. This property is exploited to bring context to the recognition stage and to avoid segmentation. The recognition is based on detecting bigram co-occurrences. Even with naive feat...